Multiclass classification of microarray data samples with a reduced number of genes


Abstract

Background: Multiclass classification of microarray data samples with a reduced number of genes is a rich and challenging problem in bioinformatics research. The problem gets harder as the number of classes increases. In addition, the performance of most classifiers is tightly linked to the effectiveness of mandatory gene selection methods. Critical to gene selection is the availability of estimates of the maximum number of genes that can be handled by any classification algorithm. Lack of such estimates may lead to either computationally demanding explorations of a search space with thousands of dimensions or classification models based on gene sets of unrestricted size. In the former case, unbiased but possibly overfitted classification models may arise. In the latter case, biased classification models unable to support statistically significant findings may be obtained.

Results: A novel bound on the maximum number of genes that can be handled by binary classifiers in binary-mediated multiclass classification algorithms for microarray data samples is presented. The bound suggests that high-dimensional binary output domains might favor the existence of accurate and sparse binary-mediated multiclass classifiers for microarray data samples.

Conclusions: Comprehensive experimental work shows that the bound is indeed useful for inducing accurate and sparse multiclass classifiers for microarray data samples.
